Smart Paper Notes

Pattern recognition in the nucleation kinetics of non-equilibrium self-assembly

Constantine Glen Evans , Jackson O'Brien , Erik Winfree , Arvind Murugan

Category: Neural and Evolutionary Computing

2022-07-13

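Inspired by biology's most sophisticated computer, the brain, neural networks constitute a profound reformulation of computational principles. Remarkably, analogous high-dimensional, highly interconnected computational architectures also arise within information-processing molecular systems inside living cells, such as signal transduction cascades and genetic regulatory networks. Might neuromorphic collective modes be found more broadly in other physical and chemical processes, even those that ostensibly play non-information-processing roles, such as protein synthesis, metabolism, or structural self-assembly? Here we examine nucleation during the self-assembly of multicomponent structures, showing that high-dimensional patterns of concentrations can be discriminated and classified in a manner similar to neural network computation. Specifically, we design a set of 917 DNA tiles that can self-assemble in three alternative ways, such that competitive nucleation depends sensitively on the extent of co-localization of high-concentration tiles within the three structures. The system was trained to classify a set of 18 grayscale 30 x 30 pixel images into three categories. Experimentally, fluorescence and atomic force microscopy monitoring during and after a 150-hour anneal established that all trained images were correctly classified, while a test set of image variations probed the robustness of the results. Although slow compared to prior biochemical neural networks, our approach is surprisingly compact, robust, and scalable. This success suggests that ubiquitous physical phenomena, such as nucleation, may hold powerful information-processing capabilities when scaled up as high-dimensional multicomponent systems.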


SoK: Let The Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning

Ahmed Salem , Giovanni Cherubin , David Evans , Boris Köpf , Andrew Paverd , Anshuman Suri , Shruti Tople , Santiago Zanella-Béguelin

Category: Machine Learning

2022-12-21

Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e., probabilistic experiments) to study security properties in cryptography, some authors describe privacy inference risks in machine learning using a similar game-based style. However, adversary capabilities and goals are often stated in subtly different ways from one presentation to the other, which makes it hard to relate and compose results. In this paper, we present a game-based framework to systematize the body of knowledge on privacy inference risks in machine learning.
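The game-based style mentioned in this abstract can be illustrated with a minimal membership inference experiment. The sketch below is not taken from the paper: the function names, the toy data, and the "model" that simply memorizes its training set are all assumptions made for illustration. A challenger flips a secret bit, hands the adversary either a training point or a fresh point, and the adversary's advantage is estimated as 2·Pr[correct guess] − 1.

```python
import random

def membership_game(train, sample_fresh, adversary, rng):
    """One round of a toy membership inference game (hypothetical setup).

    Returns True if the adversary guesses the challenger's secret bit.
    """
    b = rng.randrange(2)  # challenger's secret bit
    # b == 1: challenge point is a training member; b == 0: a fresh sample.
    z = rng.choice(train) if b == 1 else sample_fresh(rng)
    return adversary(z) == b

def estimate_advantage(train, sample_fresh, adversary, rounds=10_000, seed=0):
    """Monte Carlo estimate of the advantage 2 * Pr[win] - 1."""
    rng = random.Random(seed)
    wins = sum(membership_game(train, sample_fresh, adversary, rng)
               for _ in range(rounds))
    return 2 * wins / rounds - 1

# Toy instantiation (all assumptions): data points are integers, and the
# "trained model" leaks everything because it memorizes its training set.
data_rng = random.Random(42)
train = [data_rng.randrange(10**6) for _ in range(100)]
memorized = set(train)

def sample_fresh(rng):
    return rng.randrange(10**6)

def memorizing_adversary(z):
    return 1 if z in memorized else 0  # guess "member" iff the model has seen z

def blind_adversary(z):
    return 0  # ignores the challenge point entirely

strong = estimate_advantage(train, sample_fresh, memorizing_adversary)
weak = estimate_advantage(train, sample_fresh, blind_adversary)
print(f"memorizing adversary advantage: {strong:.3f}")
print(f"blind adversary advantage: {weak:.3f}")
```

In this toy setting the memorizing adversary's advantage should be close to 1 and the blind adversary's close to 0. Writing the experiment as an explicit procedure makes it easy to see which capability (here, full access to the memorized set) drives the result.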


LR-Sum: Summarization for Less-Resourced Languages

Chester Palen-Michel , Constantine Lignos

Category: Natural Language Processing

2022-12-19

This preprint describes work in progress on LR-Sum, a new permissively-licensed dataset created with the goal of enabling further research in automatic summarization for less-resourced languages. LR-Sum contains human-written summaries for 40 languages, many of which are less-resourced. We describe our process for extracting and filtering the dataset from the Multilingual Open Text corpus (Palen-Michel et al., 2022). The source data is public domain newswire collected from Voice of America websites, and LR-Sum is released under a Creative Commons license (CC BY 4.0), making it one of the most openly-licensed multilingual summarization datasets. We describe how we plan to use the data for modeling experiments and discuss limitations of the dataset.


Dissecting Distribution Inference

Anshuman Suri , Yifu Lu , Yanjin Chen , David Evans

Category: Machine Learning | Artificial Intelligence

2022-12-15

A distribution inference attack aims to infer statistical properties of data used to train machine learning models. These attacks are sometimes surprisingly potent, but the factors that impact distribution inference risk are not well understood and demonstrated attacks often rely on strong and unrealistic assumptions such as full knowledge of training environments even in supposedly black-box threat scenarios. To improve understanding of distribution inference risks, we develop a new black-box attack that even outperforms the best known white-box attack in most settings. Using this new attack, we evaluate distribution inference risk while relaxing a variety of assumptions about the adversary's knowledge under black-box access, like known model architectures and label-only access. Finally, we evaluate the effectiveness of previously proposed defenses and introduce new defenses. We find that although noise-based defenses appear to be ineffective, a simple re-sampling defense can be highly effective. Code is available at https://github.com/iamgroot42/dissecting_distribution_inference


Statistical Safety and Robustness Guarantees for Feedback Motion Planning of Unknown Underactuated Stochastic Systems

Craig Knuth , Glen Chou , Jamie Reese , Joe Moore

Category: Robotics | Machine Learning

2022-12-13

We present a method for providing statistical guarantees on runtime safety and goal reachability for integrated planning and control of a class of systems with unknown nonlinear stochastic underactuated dynamics. Specifically, given a dynamics dataset, our method jointly learns a mean dynamics model, a spatially-varying disturbance bound that captures the effect of noise and model mismatch, and a feedback controller based on contraction theory that stabilizes the learned dynamics. We propose a sampling-based planner that uses the mean dynamics model and simultaneously bounds the closed-loop tracking error via a learned disturbance bound. We employ techniques from Extreme Value Theory (EVT) to estimate, to a specified level of confidence, several constants which characterize the learned components and govern the size of the tracking error bound. This ensures plans are guaranteed to be safely tracked at runtime. We validate that our guarantees translate to empirical safety in simulation on a 10D quadrotor, and in the real world on a physical CrazyFlie quadrotor and Clearpath Jackal robot, whereas baselines that ignore the model error and stochasticity are unsafe.


Doubly Robust Kernel Statistics for Testing Distributional Treatment Effects Even Under One Sided Overlap

Jake Fawkes , Robert Hu , Robin J. Evans , Dino Sejdinovic

Category: Machine Learning (Statistics) | Machine Learning

2022-12-09

As causal inference becomes more widespread, the importance of having good tools to test for causal effects increases. In this work we focus on the problem of testing for causal effects that manifest in a difference in distribution for treatment and control. We build on work applying kernel methods to causality, considering the previously introduced Counterfactual Mean Embedding framework (\textsc{CfME}). We improve on this by proposing the \emph{Doubly Robust Counterfactual Mean Embedding} (\textsc{DR-CfME}), which has better theoretical properties than its predecessor by leveraging semiparametric theory. This leads us to propose new kernel-based test statistics for distributional effects which are based upon doubly robust estimators of treatment effects. We propose two test statistics, one which is a direct improvement on previous work and one which can be applied even when the support of the treatment arm is a subset of that of the control arm. We demonstrate the validity of our methods on simulated and real-world data, as well as giving an application in off-policy evaluation.


MobilePTX: Sparse Coding for Pneumothorax Detection Given Limited Training Examples

Darryl Hannan , Steven C. Nesbit , Ximing Wen , Glen Smith , Qiao Zhang , Alberto Goffi , Vincent Chan , Michael J. Morris , John C. Hunninghake , Nicholas E. Villalobos

Category: Computer Vision

2022-12-06

Point-of-Care Ultrasound (POCUS) refers to clinician-performed and interpreted ultrasonography at the patient's bedside. Interpreting these images requires a high level of expertise, which may not be available during emergencies. In this paper, we support POCUS by developing classifiers that can aid medical professionals by diagnosing whether or not a patient has pneumothorax. We decomposed the task into multiple steps, using YOLOv4 to extract relevant regions of the video and a 3D sparse coding model to represent video features. Given the difficulty in acquiring positive training videos, we trained a small-data classifier with a maximum of 15 positive and 32 negative examples. To counteract this limitation, we leveraged subject matter expert (SME) knowledge to limit the hypothesis space, thus reducing the cost of data collection. We present results using two lung ultrasound datasets and demonstrate that our model is capable of achieving performance on par with SMEs in pneumothorax identification. We then developed an iOS application that runs our full system in less than 4 seconds on an iPad Pro, and less than 8 seconds on an iPhone 13 Pro, labeling key regions in the lung sonogram to provide interpretable diagnoses.


D-ITAGS: A Dynamic Interleaved Approach to Resilient Task Allocation, Scheduling, and Motion Planning

Glen Neville , Sonia Chernova , Harish Ravichandar

Category: Robotics

2022-09-27

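Complex, multi-objective missions require the coordination of heterogeneous robots at multiple interconnected levels, such as coalition formation, scheduling, and motion planning. This challenge is exacerbated by dynamic changes, such as sensor and actuator failures, communication loss, and unexpected delays. We introduce Dynamic Iterative Task Allocation Graph Search (D-ITAGS) to \textit{simultaneously} address coalition formation, scheduling, and motion planning in dynamic settings involving heterogeneous teams. D-ITAGS achieves resilience via two key characteristics: i) interleaved execution, and ii) targeted repair. \textit{Interleaved execution} enables an effective search for solutions at each layer while avoiding incompatibility with other layers. \textit{Targeted repair} identifies and repairs the parts of the existing solution that are impacted by a given disruption, while conserving the rest. In addition to algorithmic contributions, we provide theoretical insights into the inherent trade-off between time and resource optimality in these settings and derive meaningful bounds on schedule suboptimality. Our experiments reveal that, in dynamic settings, i) D-ITAGS is significantly faster than recomputation from scratch, with little to no loss in solution quality, and ii) the theoretical suboptimality bounds consistently hold in practice.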


Accelerating Online Reinforcement Learning via Supervisory Safety Systems

Benjamin Evans , Johannes Betz , Hongrui Zheng , Herman A. Engelbrecht , Rahul Mangharam , Hendrik W. Jordaan

Category: Robotics

2022-09-22

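Deep reinforcement learning (DRL) is a promising method for learning control policies for robots from demonstration and experience alone. To cover the whole dynamic behaviour of the robot, DRL training is an active exploration process that is typically carried out in simulation environments. Although this simulation training is cheap and fast, applying DRL algorithms to real-world settings is difficult. If agents are trained until they perform safely in simulation, transferring them to physical systems is difficult because of the sim-to-real gap caused by differences between the simulated dynamics and the physical robot. In this paper, we present a method for training a DRL agent online to drive autonomously on a physical vehicle by using a model-based safety supervisor. Our solution uses a supervisory system to check whether the action selected by the agent is safe or unsafe and to ensure that a safe action is always implemented on the vehicle. With this, we can bypass the sim-to-real problem while training the DRL algorithm safely, quickly, and efficiently. We provide a variety of real-world experiments in which we train a small-scale physical vehicle online to drive autonomously with no prior simulation training. The evaluation results show that our method trains agents with improved sample efficiency while never crashing, and that the trained agents demonstrate better driving performance than those trained in simulation.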
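The supervisory check described in this abstract (test whether the agent's chosen action is safe, and otherwise apply a safe one) can be sketched on a toy problem. Everything below is an assumption made for illustration and is not the paper's supervisor: the one-dimensional lateral dynamics, the track width, the fallback rule, and all function names are hypothetical.

```python
TRACK_HALF_WIDTH = 1.0  # hypothetical: safe lateral positions satisfy |y| <= 1.0
DT = 0.1                # hypothetical time step

def step(y, steer):
    """Toy lateral dynamics: position moves in proportion to the steering input."""
    return y + DT * steer

def is_safe(y, steer):
    """Model-based check: does the predicted next position stay on the track?"""
    return abs(step(y, steer)) <= TRACK_HALF_WIDTH

def safe_fallback(y):
    """Hypothetical fallback action: steer back toward the centre line.

    From any on-track position, this maps y to 0.9 * y, which is still on track.
    """
    return -y

def supervise(y, proposed):
    """Return (action to apply, whether the supervisor intervened)."""
    if is_safe(y, proposed):
        return proposed, False
    return safe_fallback(y), True

def run_episode(policy, steps=500):
    """Run the supervised loop; return (largest |y| seen, number of interventions)."""
    y, max_abs, interventions = 0.0, 0.0, 0
    for _ in range(steps):
        action, intervened = supervise(y, policy(y))
        interventions += intervened
        y = step(y, action)
        max_abs = max(max_abs, abs(y))
    return max_abs, interventions

# A deliberately bad stand-in for an untrained agent: always steer hard one way.
max_abs, interventions = run_episode(lambda y: 5.0)
print(f"max |y| = {max_abs:.3f}, interventions = {interventions}")
```

In this sketch the vehicle never leaves the track even though the stand-in policy is unsafe on its own, because the fallback keeps every on-track state on track. A learning agent would additionally be updated from the transitions that were actually applied; that part is omitted here.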


Reconstruction of Long-Term Historical Demand Data

Reshmi Ghosh , Michael Craig , H. Scott Matthews , Constantine Samaras , Laure Berti-Equille

Category: Machine Learning

2022-09-10

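Long-term planning of a robust power system requires an understanding of changing demand patterns. Electricity demand is sensitive to weather. Thus, supply-side variation from introducing intermittent renewable sources, juxtaposed with variable demand, will introduce additional challenges into the grid planning process. By understanding the spatial and temporal variability of temperature over the US, the response of demand to natural variability can be separated from its response to climate-change-related effects, especially because the effects due to the former factor are not yet known. Through this project, we aim to better support the technology and policy development process for power systems by developing machine learning and deep learning "back-forecasting" models to reconstruct multi-year demand records and to study the natural variability of temperature and its influence on demand.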
